Abstract

The common practice for displaying error bars on distributions of numbers of 
events is confusing and can lead to incorrect conclusions. A proposal is made 
for a different style of presentation that more directly indicates the level of 
agreement between expectations and observations. 

1 Introduction 

Symmetric error bars centered on the observed number of events can be highly misleading and in practice
often generate confusion. A typical example¹ of such a data presentation is given in Fig. 1.
Fig. 1: A typical presentation of data, here event counts as a function of mass in 25 GeV intervals. The same data
are plotted twice: left, linear scale; right, logarithmic scale. The error bars cover the range ($o - \sqrt{o}$, $o + \sqrt{o}$),
where $o$ is the number of observed events. The histogram gives the expectations (means of Poisson distributions).

¹ The data are artificial and were invented for the purposes of this note.

There are three problems with this standard style of presentation:

- First of all, there is no uncertainty on the number of observed events. We certainly do not mean that
there is a high probability that we had 2.3 rather than 2 events in the 7th bin in the plot. Actually,
the error bar is intended to represent the uncertainty on a different quantity: the uncertainty on the
mean of an assumed underlying Poisson distribution. The probability distribution for this mean
given an observed number of events $o$, $P(\nu|o)$, can be quite asymmetric, and different choices can
be made regarding what to plot as a summary (e.g., the mode, the mean or the median) of $P(\nu|o)$.

- The second problem arises with the length of the error bar. This is routinely taken as $\pm\sqrt{o}$,
motivated by the Poisson result that the variance is equal to the mean, so that the error bar should
cover $\pm 1\sigma$, or 68 % probability, for possible values of $\nu$. However, the probability range covered
by this definition of the error bar varies dramatically as $o \to 0$, and the probability above and
below the point is highly asymmetric. Usually no error bar is plotted when 0 events are measured,
although this measurement also yields information on possible values of $\nu$. These problems are
occasionally avoided by using asymmetric error bars, usually covering the central 68 % of the
probability from the cumulative of $P(\nu|o)$, but this is still the exception rather than the rule in
experimental particle physics.

- A third problem occurs when data are compared to expectations, as in Fig. 1, and the error bar
is used to determine if the observed number of events represents a significant deviation from the
expectation. The error bar on the plot often gives the completely wrong information in this case,
since the relevant probability is the probability that the expectation could have yielded the number
of observed events, not the probability that the observed number of events could have fluctuated
to the expectation. For example, in the next-to-last bin in Fig. 1 the expectation is 0.011 and
two events are observed. It is VERY wrong to conclude that we have slightly more than a $1\sigma$
discrepancy in this bin.
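
The size of the mismatch in that bin can be checked directly. The following is a minimal Python sketch (the function name `poisson_tail` is ours and not part of the note) that evaluates the probability for a Poisson expectation of 0.011 to yield two or more events:

```python
import math

def poisson_tail(o_min: int, nu: float) -> float:
    """P(o >= o_min | nu) for a Poisson distribution with mean nu."""
    term = math.exp(-nu)          # P(0 | nu)
    below = 0.0
    for i in range(o_min):
        below += term             # accumulate P(0) + ... + P(o_min - 1)
        term *= nu / (i + 1)      # P(i + 1 | nu) from P(i | nu)
    return 1.0 - below

p = poisson_tail(2, 0.011)
print(f"P(o >= 2 | nu = 0.011) = {p:.2e}")
```

The result is of order 6 in 100,000, far rarer than the roughly one-in-three chance that a "slightly more than $1\sigma$" reading of the error bar would suggest.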

A proposal for an alternative presentation is given here. We focus on the case where we are 
comparing observations to predictions and the fluctuations can be modeled with a Poisson distribution. 
We start with the simplest case - that predictions are available with negligible uncertainty. We then show 
how to include the uncertainty due to predictions based on finite Monte Carlo event sets and due to 
systematic uncertainties. 



2 Negligible uncertainty on the prediction 

We start with the simplest case: the expectations (means of Poisson distributions) are known with very
small uncertainty, and we want to compare the observed data to these expectations. The plot should
give us an indication of whether the observations are within reasonable statistical fluctuations of the
expectation; i.e., the plot should give the user an indication of how rare a particular observed number of
events $o$ was expected to be given a Poisson distribution with mean number of events $\nu$. The probability
distribution for $o$ is

$$ P(o|\nu) = \frac{e^{-\nu}\,\nu^{o}}{o!} . \qquad (1) $$

The most probable value for $o$ (mode of the probability distribution) is given by

$$ o^{*} = \lfloor \nu \rfloor , \qquad (2) $$

i.e., the largest integer not greater than $\nu$.
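
As a small illustration of Eqs. (1) and (2), the sketch below evaluates the Poisson probabilities and the mode in Python. The function names are ours, and the value $\nu = 10/3$ is an assumption chosen because it reproduces the rounded entries of Table 1 (where the mean is quoted as 3.3).

```python
import math

def poisson_pmf(o: int, nu: float) -> float:
    """Eq. (1), evaluated in log space; requires nu > 0."""
    return math.exp(-nu + o * math.log(nu) - math.lgamma(o + 1))

def poisson_mode(nu: float) -> int:
    """Eq. (2): the largest integer not greater than nu."""
    return math.floor(nu)

nu = 10 / 3  # assumed value, quoted as 3.3 in the text
for o in range(6):
    print(o, round(poisson_pmf(o, nu), 4))
print("mode:", poisson_mode(nu))
```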

Two choices can be made for the probability intervals to display: 

1. The central interval, defined for a probability density $P(x)$ with possible range for the parameter
$\{x_{\min}, x_{\max}\}$ by

$$ \alpha/2 = \int_{x_{\min}}^{x_1} P(x)\,dx = \int_{x_2}^{x_{\max}} P(x)\,dx . $$

The central interval is $\{x_1, x_2\}$ and contains probability $1-\alpha$. In our case, we have a discrete
probability distribution $P(o|\nu)$ and the equality generally cannot be satisfied. We take instead

$$ o_1 = \sup\Big\{ o : \sum_{i=0}^{o} P(i|\nu) < \alpha/2 \Big\} + 1 \qquad (3) $$

$$ o_2 = \inf\Big\{ o : \sum_{i=o}^{\infty} P(i|\nu) < \alpha/2 \Big\} - 1 . \qquad (4) $$

If $P(o = 0|\nu) > \alpha/2$, then we take $o_1 = 0$.

We define the set of observations which fall into the central $1-\alpha$ probability interval² as

$$ O^{C}_{1-\alpha} = \{o_1, o_1 + 1, \dots, o_2\} $$

and we display these values of $o$. Different colors can be used to represent different $1-\alpha$ probability
ranges. For example, if we take $\nu = 3.3$, then we find the values given in Table 1. If we
choose $1-\alpha = 0.68$, then our definitions give

$$ O^{C}_{0.68} = \{2,3,4,5\} $$

whereas

$$ O^{C}_{0.95} = \{0,1,2,3,4,5,6,7\} $$

and

$$ O^{C}_{0.999} = \{0,1,2,3,4,5,6,7,8,9,10,11\} . $$



Table 1: Values of $o$, the probability to observe such a value given $\nu = 3.3$ (the entries correspond to $\nu = 10/3$) and the cumulative probability, rounded
to four decimal places. The fourth column gives the rank in terms of probability, i.e., the order in which this value
of $o$ is used in calculating the smallest set $O^{S}_{1-\alpha}$, and the last column gives the cumulative probability summed
according to the rank.



 o    $P(o|\nu)$   $F(o|\nu)$    R    $F_R(o|\nu)$
 0    0.0357       0.0357        7    0.9468
 1    0.1189       0.1546        5    0.8431
 2    0.1982       0.3528        2    0.4184
 3    0.2202       0.5730        1    0.2202
 4    0.1835       0.7565        3    0.6019
 5    0.1223       0.8788        4    0.7242
 6    0.0680       0.9468        6    0.9111
 7    0.0324       0.9792        8    0.9792
 8    0.0135       0.9927        9    0.9927
 9    0.0050       0.9976       10    0.9976
10    0.0017       0.9993       11    0.9993
11    0.0005       0.9998       12    0.9998
12    0.0001       1.0000       13    1.0000



2. The second option for the probability interval is to use the smallest interval containing a given
probability. In the case of a unimodal continuous distribution, we can write the condition as

$$ 1-\alpha = \int_{x_1}^{x_2} P(x)\,dx \quad \text{and} \quad P(x_1) = P(x_2) . $$

For our discrete case, the set making up the smallest interval containing probability at least $1-\alpha$,
$O^{S}_{1-\alpha}$, is defined by the following algorithm:

(a) Start with $O^{S}_{1-\alpha} = \{o^{*}\}$. If $P(o^{*}|\nu) \geq 1-\alpha$, then we are done. An example where this
requirement is fulfilled for $1-\alpha = 0.68$ is ($\nu = 0.001$, $o^{*} = 0$, $O^{S}_{0.68} = \{0\}$).

(b) If $P(o^{*}|\nu) < 1-\alpha$, then we need to add the next most probable number of observations,
which in the unimodal case is either $o^{*}+1$ or $o^{*}-1$. Assume that $P(o^{*}+1|\nu) > P(o^{*}-1|\nu)$.
Then, we would extend our set to $O^{S}_{1-\alpha} = \{o^{*}, o^{*}+1\}$ and check again whether
$P(O^{S}_{1-\alpha}) \geq 1-\alpha$. We would continue to add members to the set $O^{S}_{1-\alpha}$ until this condition is met, always
taking the highest probability of the remaining possible observations.

In the special case that two observations have exactly the same probability (when $\nu$ takes on
an integer value, $P(o = \nu) = P(o = \nu - 1)$), both values should be taken in the set.

If we consider the example given in Table 1, then using the cumulative according to rank, $F_R$, we
find

$$ O^{S}_{0.68} = \{2,3,4,5\} $$

whereas

$$ O^{S}_{0.95} = \{0,1,2,3,4,5,6,7\} $$

and

$$ O^{S}_{0.999} = \{0,1,2,3,4,5,6,7,8,9,10\} . $$

The same sets are found as for the central interval for $1-\alpha = 0.68$ and 0.95, but the set is smaller for
0.999.

² Note that $1-\alpha$ is the minimum probability covered and that the set generally covers a larger probability.
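
The two interval definitions can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are ours, the Poisson mean is taken as $\nu = 10/3$ (which reproduces the Table 1 entries), and the infinite sum in the upper tail is cut off at a large `o_max`.

```python
import math

def poisson_pmf_list(nu, o_max):
    """P(o | nu) for o = 0..o_max via P(o + 1) = P(o) * nu / (o + 1)."""
    p = [math.exp(-nu)]
    for o in range(o_max):
        p.append(p[-1] * nu / (o + 1))
    return p

def central_set(p, alpha):
    """Central set: skip outcomes while each tail sum stays below alpha / 2."""
    lo, cum = 0, 0.0
    while cum + p[lo] < alpha / 2:      # lower tail, sum over i <= lo
        cum += p[lo]
        lo += 1
    hi, cum = len(p) - 1, 0.0
    while cum + p[hi] < alpha / 2:      # upper tail, sum over i >= hi
        cum += p[hi]
        hi -= 1
    return list(range(lo, hi + 1))

def smallest_set(p, alpha):
    """Smallest set: add outcomes by decreasing probability until 1 - alpha is covered."""
    order = sorted(range(len(p)), key=lambda o: -p[o])
    chosen, total = [], 0.0
    for o in order:
        # Outcomes tied with the last one added are taken together.
        tied = bool(chosen) and math.isclose(p[o], p[chosen[-1]], rel_tol=1e-12)
        if total >= 1 - alpha and not tied:
            break
        chosen.append(o)
        total += p[o]
    return sorted(chosen)

p = poisson_pmf_list(10 / 3, 60)
for prob in (0.68, 0.95, 0.999):
    print(prob, central_set(p, 1 - prob), smallest_set(p, 1 - prob))
```

Run as written, this should return the sets quoted above for both definitions, including the difference at $1-\alpha = 0.999$.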

If the predicted distribution has a large mean ($\nu > 50$, say), then we can use the Gaussian approximation.
In this case, we take the minimal symmetric range around $o^{*}$ such that

$$ \int_{o^{*}-(n+0.5)}^{o^{*}+(n+0.5)} \frac{1}{\sqrt{2\pi\nu}}\, e^{-\frac{(x-o^{*})^{2}}{2\nu}}\, dx \geq 1-\alpha $$

(here $n$ is the integer half-width of the range) and define $O_{1-\alpha} = \{o^{*}-n, \dots, o^{*}+n\}$. This gives both the central interval as well as the minimal interval.
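
A minimal sketch of this Gaussian prescription follows. It assumes, as in the integral above, a Gaussian of variance $\nu$ centered on $o^{*}$, so that the integral reduces to an error function; the function name and the test value $\nu = 100$ are ours.

```python
import math

def gaussian_half_width(nu: float, alpha: float) -> int:
    """Smallest integer n such that the Gaussian integral over
    o* -/+ (n + 0.5) is at least 1 - alpha."""
    sigma = math.sqrt(nu)
    n = 0
    # For a Gaussian centered on o*, the integral over +-(n + 0.5)
    # equals erf((n + 0.5) / (sigma * sqrt(2))).
    while math.erf((n + 0.5) / (sigma * math.sqrt(2.0))) < 1 - alpha:
        n += 1
    return n

nu = 100.0                      # illustrative value only
o_star = math.floor(nu)
n = gaussian_half_width(nu, 1 - 0.68)
print(o_star - n, o_star + n)   # end points of the 68 % set
```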

As an example for the procedures defined here, Fig. 2 shows the same distribution of observed
number of events as a function of invariant mass as was shown in Fig. 1. Three different probability
intervals are shown, corresponding to $1-\alpha = 0.68$, $1-\alpha = 0.95$, and $1-\alpha = 0.999$; they follow the
definition of the smallest interval. Note that the bands are extended beyond the integer values in the set
by 0.5 for clarity of presentation. E.g., the band containing the set $\{2,3,4,5\}$ is drawn from 1.5 to 5.5.
The smallest $1-\alpha$ color is chosen if more than one $1-\alpha$ set contains the same set members (e.g., this
occurs when the set $\{0\}$ contains $\geq 95$ % probability, as in the last bins in the plot). The color scheme is
meant to be suggestive. Observed event counts outside the shaded bands should indicate unlikely results.
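
The drawing rules in this paragraph (half-integer band edges, and the smallest $1-\alpha$ color winning when sets coincide) can be written down compactly. The sketch below is illustrative only: the function name `bands` and the data layout are ours, the colors follow the caption of Fig. 2, and the second input is a hypothetical bin in which $\{0\}$ already covers 95 %.

```python
# (1 - alpha, color) pairs in increasing order of probability content.
LEVELS = [(0.68, "green"), (0.95, "yellow"), (0.999, "red")]

def bands(sets):
    """Convert the allowed-count sets of one bin into (low, high, color) bands.

    `sets` maps 1 - alpha to the sorted list of counts in the set.
    """
    drawn, seen = [], set()
    for prob, color in LEVELS:
        members = tuple(sets[prob])
        if members in seen:
            continue                      # same set: keep the smaller 1 - alpha color
        seen.add(members)
        drawn.append((members[0] - 0.5, members[-1] + 0.5, color))
    return drawn

print(bands({0.68: [2, 3, 4, 5], 0.95: list(range(8)), 0.999: list(range(11))}))
print(bands({0.68: [0], 0.95: [0], 0.999: [0, 1]}))
```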



3 Predictions based on Monte Carlo with finite statistics 

We now extend the prescription for cases where the prediction is uncertain. The first case we consider
is that the prediction is based on a Monte Carlo (MC) sample which has non-negligible statistical uncertainties. We
derive the probability distribution for the expected number of events $o$ given that a MC set (consisting
possibly of different components) gives $n$ events and the MC normalization factor is $s$ (defined such that
the MC prediction is divided by the factor $s$ when calculating the expected mean for the data). The factor
$s$ is initially taken to be known exactly.

The process of assigning Monte Carlo events to bins indicates the use of the multinomial distribution
for calculations (if we generate a fixed number of events rather than a fixed luminosity). We assume
here that the probability for an event to populate any given bin is small so that Poisson statistics can be
used as a valid approximation. Assume our Monte Carlo sample has resulted in $n$ events (in a bin of interest).
We now need to determine the expected mean for the data sample and the probability distribution
for this mean. We use $\lambda$ for the mean of the MC distribution in the bin, and $\nu$ for the mean expected for
the observations. Applying Bayes' Theorem [1] and taking the Jeffreys prior [2] on the mean for the
MC, $\lambda$,

$$ P_0(\lambda) \propto \lambda^{-1/2} , $$
Fig. 2: The proposed style of presentation for the same data as shown in Fig. 1. The shaded bands represent
different probability intervals for the observed number of events: green is for $1-\alpha = 0.68$; yellow is for $1-\alpha = 0.95$
and red is for $1-\alpha = 0.999$. The bands use the smallest interval containing at least this probability, and
extend to the next 1/2 integer value (see text). For the logarithmic scale plot on the right, the data are shown as a
solid triangle if $o = 0$.



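the pdf for $\lambda$ is

$$ P(\lambda|n) = \frac{4^{n}\,n!}{\sqrt{\pi}\,(2n)!}\, \lambda^{n-1/2}\, e^{-\lambda} , $$

which leads to

$$ E[\lambda] = n + 1/2 \quad \text{and} \quad \lambda^{*} = n - 1/2 . $$

For scaling the prediction to the mean expected for the data ($\nu = \lambda/s$), we have:

$$ P(\nu|n,s) = P(\lambda|n)\,\frac{d\lambda}{d\nu} = \frac{4^{n}\,n!}{\sqrt{\pi}\,(2n)!}\, s\,(s\nu)^{n-1/2}\, e^{-s\nu} . $$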



For the distribution of data events, we use the Law of Total Probability [3]:

$$ P(o|n,s) = \int P(o|\nu)\,P(\nu|n,s)\,d\nu = \frac{s^{n+1/2}\,4^{n}\,n!}{o!\,\sqrt{\pi}\,(2n)!} \int_0^\infty \nu^{o+n-1/2}\, e^{-(s+1)\nu}\, d\nu . $$

The integral gives

$$ \frac{[2(o+n)]!\,\sqrt{\pi}}{4^{n+o}\,(1+s)^{o+n+1/2}\,(n+o)!} $$

so

$$ P(o|n,s) = \frac{s^{n+1/2}}{(1+s)^{o+n+1/2}}\, \frac{n!\,[2(o+n)]!}{4^{o}\,o!\,(2n)!\,(n+o)!} \qquad (5) $$

and

$$ E[o] = \frac{n + 1/2}{s} . $$
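
The step from the integral to the factorial expression rests on two standard Gamma-function identities, written out here for convenience (they are textbook results, not taken from the note):

```latex
\int_0^\infty \nu^{\,o+n-1/2}\, e^{-(s+1)\nu}\, d\nu
  = \frac{\Gamma\!\left(o+n+\tfrac{1}{2}\right)}{(1+s)^{\,o+n+1/2}} ,
\qquad
\Gamma\!\left(k+\tfrac{1}{2}\right) = \frac{(2k)!\,\sqrt{\pi}}{4^{k}\,k!}
\quad (k = 0, 1, 2, \dots) .
```

Setting $k = o + n$ gives the value of the integral, and $k = n$ gives the normalization of the pdf for the MC mean.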



For $o = 0$, we have

$$ P(o = 0|n,s) = \left(\frac{s}{1+s}\right)^{n+1/2} . \qquad (6) $$

The probability for the succeeding values of $o$ can then be easily calculated as

$$ P(o+1|n,s) = P(o|n,s)\, \frac{(2n+2o+2)(2n+2o+1)}{4\,(o+1)\,(1+s)\,(n+o+1)} . $$



We would now use these probabilities of $o$ for the procedure described in the previous section
rather than the Poisson distribution for $P(o|\nu)$.
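
A short Python sketch of this calculation is given below. It builds $P(o|n,s)$ from Eq. (6) and the recursion, and cross-checks it against the closed form of Eq. (5) evaluated with log-Gamma functions. The function names are ours; $n = 10$ and $s = 3$ are the values used for Table 2.

```python
import math

def mc_pmf(n: int, s: float, o_max: int) -> list:
    """P(o | n, s) for o = 0..o_max from Eq. (6) and the recursion."""
    p = [(s / (1.0 + s)) ** (n + 0.5)]
    for o in range(o_max):
        ratio = ((2 * n + 2 * o + 2) * (2 * n + 2 * o + 1)
                 / (4.0 * (o + 1) * (1.0 + s) * (n + o + 1)))
        p.append(p[-1] * ratio)
    return p

def mc_pmf_closed(o: int, n: int, s: float) -> float:
    """Eq. (5), with factorials replaced by lgamma to avoid overflow."""
    log_p = ((n + 0.5) * math.log(s) - (o + n + 0.5) * math.log(1.0 + s)
             + math.lgamma(n + 1) + math.lgamma(2 * (o + n) + 1)
             - o * math.log(4.0) - math.lgamma(o + 1)
             - math.lgamma(2 * n + 1) - math.lgamma(n + o + 1))
    return math.exp(log_p)

p = mc_pmf(10, 3.0, 200)
print([round(x, 4) for x in p[:4]])
print(sum(o * x for o, x in enumerate(p)))   # compare with (n + 1/2) / s
```

The recursion is the cheaper route when the whole distribution is needed; the closed form is convenient for a single value of $o$.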



Table 2: Values of $o$, the probability to observe such a value given $n = 10$ and $s = 3$ and the cumulative
probability, rounded to four decimal places. The fourth column gives the rank in terms of probability, i.e., the
order in which this value of $o$ is used in calculating the smallest set $O^{S}_{1-\alpha}$, and the last column gives the cumulative
probability summed according to the rank.



 o    $P(o|n,s)$   $F(o|n,s)$    R    $F_R(o|n,s)$
 0    0.0488       0.0488        7    0.9072
 1    0.1280       0.1768        4    0.6654
 2    0.1840       0.3608        2    0.3757
 3    0.1917       0.5525        1    0.1917
 4    0.1617       0.7142        3    0.5374
 5    0.1173       0.8315        5    0.7827
 6    0.0757       0.9072        6    0.8584
 7    0.0446       0.9519        8    0.9519
 8    0.0244       0.9763        9    0.9763
 9    0.0125       0.9888       10    0.9888
10    0.0061       0.9949       11    0.9949
11    0.0028       0.9978       12    0.9978
12    0.0013       0.9991       13    0.9991
13    0.0006       0.9996       14    0.9996
14    0.0002       0.9998       15    0.9998
15    0.0001       0.9999       16    0.9999



An example of the effect of finite Monte Carlo statistics on $P(o)$ is given in Fig. 3. As is clear, if
the Monte Carlo used to derive the prediction for the number of events has an effective luminosity only a
factor 3 larger than the data, significant differences can result in probabilities for observed events. If we
consider the example given in Table 2, where $n = 10$ and $s = 3$, the mean value of $\nu$ is $E[\nu] = 3.5$ and the
finite MC statistics gives slightly different results for our sets:

$O^{C}_{0.68} = \{1,2,3,4,5,6\}$       $O^{S}_{0.68} = \{1,2,3,4,5\}$
$O^{C}_{0.95} = \{0,1,2,\dots,8\}$     $O^{S}_{0.95} = \{0,1,2,\dots,7\}$
$O^{C}_{0.999} = \{0,1,2,\dots,13\}$   $O^{S}_{0.999} = \{0,1,2,\dots,12\}$ .

As expected, the set of possible values of $o$ has increased for a given $1-\alpha$ probability due to the extra
uncertainty introduced by the finite number of Monte Carlo counts.
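
The comparison can be reproduced with a short script that places the exact-prediction case next to the finite-MC case. This is a sketch under the same assumptions as the tables ($\nu = 10/3$ for the Poisson case, $n = 10$ and $s = 3$ for the MC case); the function names are ours, and ties between outcomes are not treated specially here.

```python
import math

def poisson(nu, o_max):
    """Poisson probabilities P(o | nu) for o = 0..o_max."""
    p = [math.exp(-nu)]
    for o in range(o_max):
        p.append(p[-1] * nu / (o + 1))
    return p

def finite_mc(n, s, o_max):
    """P(o | n, s) for o = 0..o_max, from P(0 | n, s) and the recursion."""
    p = [(s / (1 + s)) ** (n + 0.5)]
    for o in range(o_max):
        p.append(p[-1] * (2 * n + 2 * o + 2) * (2 * n + 2 * o + 1)
                 / (4 * (o + 1) * (1 + s) * (n + o + 1)))
    return p

def smallest_set(p, prob):
    """Add outcomes by decreasing probability until `prob` is covered."""
    chosen, total = [], 0.0
    for o in sorted(range(len(p)), key=lambda i: -p[i]):
        if total >= prob:
            break
        chosen.append(o)
        total += p[o]
    return sorted(chosen)

exact = poisson(10 / 3, 100)
mc = finite_mc(10, 3.0, 100)
for prob in (0.68, 0.95, 0.999):
    a, b = smallest_set(exact, prob), smallest_set(mc, prob)
    print(prob, (a[0], a[-1]), (b[0], b[-1]))   # (first, last) member of each set
```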
Fig. 3: Comparison of the distributions $P(o|\nu = 10/3)$ and $P(o|n = 10, s = 3)$.



4 Prediction with systematic uncertainties 

If the predictions have systematic uncertainties, then these can also be taken into account in defining the
probabilities for the observed number of events. The probability density for $\nu$ will now depend on extra
quantities, and we write

$$ P(\nu|n,s,\theta) $$

where $\theta$ is a set of nuisance parameters used to describe the systematic uncertainties (e.g., energy scale
parameters). It may be possible to determine $P(\nu|n,s,\theta)$ directly from Monte Carlo simulations where
the systematically uncertain quantities are also varied according to their belief distributions. In this case,
we would use this information and have:

$$ P(o|n,s,\theta) = \int P(o|\nu)\,P(\nu|n,s,\theta)\,d\nu . $$

In general, this integral will need to be solved numerically and the $P(o|n,s,\theta)$ then input into the
prescriptions above.

In many cases, we have a fixed number of MC events and the same events are used repeatedly with
different assumptions for the uncertain quantities, so that the systematic uncertainty is on the scaling
parameter $s$. In this case, we can write

$$ P(\nu|n,\theta) = \int P(\nu|n,s)\,P(s|\theta)\,ds $$

where the systematic uncertainty appears as a pdf for the scale factor. The probability distribution for $o$
is now

$$ P(o|n,\theta) = \int P(o|\nu) \left[ \int P(\nu|n,s)\,P(s|\theta)\,ds \right] d\nu . $$
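Often, we assume the belief in values for the scale factor $s$ can be modeled as a Gaussian:

$$ P(s|s_0,\sigma_s) = \frac{1}{\sqrt{2\pi}\,\sigma_s}\, e^{-\frac{(s-s_0)^2}{2\sigma_s^2}} . $$

In this case, we would have

$$ P(o|n,s_0,\sigma_s) = \int \frac{e^{-\nu}\,\nu^{o}}{o!} \left[ \int \frac{4^{n}\,n!}{\sqrt{\pi}\,(2n)!}\, s\,(s\nu)^{n-1/2}\, e^{-s\nu}\, \frac{1}{\sqrt{2\pi}\,\sigma_s}\, e^{-\frac{(s-s_0)^2}{2\sigma_s^2}}\, ds \right] d\nu . \qquad (7) $$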

This looks rather forbidding but can be solved numerically. Taking our standard example, we now add
to the MC statistical uncertainty a 30 % systematic uncertainty in the scale factor $s$ and find³ the results
given in Table 3. A graphical presentation of the effect on $P(o)$ of including systematic uncertainties on
$s$ is shown in Fig. 4.
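
One possible way to carry out the numerical evaluation is sketched below; it is not necessarily the method used for Table 3. The sketch relies on an assumption of ours: the order of the two integrations is exchanged, so that the integral over $\nu$ is done analytically (giving Eq. (5)) and only a one-dimensional average over $s$ remains. The Gaussian in $s$ is truncated to $s > 0$ and renormalized, and a plain midpoint rule is used. All function names are ours.

```python
import math

def log_p_finite_mc(o, n, s):
    """log of Eq. (5), using log-Gamma in place of the factorials."""
    return ((n + 0.5) * math.log(s) - (o + n + 0.5) * math.log1p(s)
            + math.lgamma(n + 1) + math.lgamma(2 * (o + n) + 1)
            - o * math.log(4.0) - math.lgamma(o + 1)
            - math.lgamma(2 * n + 1) - math.lgamma(n + o + 1))

def p_with_scale_systematic(o, n, s0, rel_sigma, steps=4000):
    """Average Eq. (5) over a Gaussian in s (mean s0, width rel_sigma * s0),
    truncated to 0 < s < s0 + 8 sigma and renormalized."""
    sigma = rel_sigma * s0
    s_max = s0 + 8.0 * sigma
    h = s_max / steps
    num = den = 0.0
    for k in range(steps):
        s = (k + 0.5) * h                          # midpoint of the k-th slice
        w = math.exp(-0.5 * ((s - s0) / sigma) ** 2)
        num += w * math.exp(log_p_finite_mc(o, n, s))
        den += w
    return num / den                               # den renormalizes the truncated Gaussian

for o in range(4):
    print(o, round(p_with_scale_systematic(o, 10, 3.0, 0.3), 4))
```

With $n = 10$, $s_0 = 3$ and a 30 % width, the first few values come out close to the corresponding entries of Table 3.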

Table 3: Values of $o$, the probability to observe such a value given $n = 10$ and $s = 3$ and 30 % systematic
uncertainty on $s$, together with the cumulative probability, rounded to four decimal places. The fourth column
gives the rank in terms of probability, i.e., the order in which this value of $o$ is used in calculating the smallest set
$O^{S}_{1-\alpha}$, and the last column gives the cumulative probability summed according to the rank.






 o    $P(o|n,s)$   $F(o|n,s)$    R    $F_R(o|n,s)$
 0    0.0539       0.0539        7    0.8492
 1    0.1272       0.1811        4    0.6111
 2    0.1699       0.3511        2    0.3403
 3    0.1703       0.5214        1    0.1703
 4    0.1436       0.6650        3    0.4839
 5    0.1083       0.7733        5    0.7194
 6    0.0759       0.8492        6    0.7953
 7    0.0509       0.9001        8    0.9001
 8    0.0332       0.9333        9    0.9333
 9    0.0214       0.9547       10    0.9547
10    0.0138       0.9685       11    0.9685
11    0.0090       0.9776       12    0.9776
12    0.0060       0.9835       13    0.9835
13    0.0040       0.9876       14    0.9876
14    0.0028       0.9904       15    0.9904



The elements of our different sets are now given by

$O^{C}_{0.68} = \{1,2,3,4,5,6\}$       $O^{S}_{0.68} = \{1,2,3,4,5\}$
$O^{C}_{0.95} = \{0,1,2,\dots,11\}$    $O^{S}_{0.95} = \{0,1,2,\dots,9\}$
$O^{C}_{0.999} = \{0,1,2,\dots,44\}$   $O^{S}_{0.999} = \{0,1,2,\dots,33\}$ .

The 68 % interval is now different between the two definitions, and very large values of $o$ are
allowed at probability $1-\alpha = 0.999$, in particular for the central interval.

³ Note that with such a large systematic variation, the Gaussian distribution in Eq. (7) is truncated and renormalized, leading
to a shift in $E[o]$.
Fig. 4: Comparison of the distribution $P(o|n = 10, s_0 = 3, \sigma_s = 0.3 s_0)$ with $P(o|\nu = 10/3)$ (top) and of
$P(o|\nu = 10/3, \sigma_s = 0.3 s_0)$ with $P(o|\nu = 10/3)$ (bottom).



5 Summary 

We have described an alternative presentation of data for cases where the aim is to judge whether an 
observed number of events is consistent with the model predictions. We believe this style of presentation 
is more appropriate than the common one, where error bars are placed on the observed number of events. 
Example code snippets for the examples described here can be requested from the authors. 